Axioms for Rational Reinforcement Learning
Abstract
We provide a formal, simple and intuitive theory of rational decision making, including sequential decisions that affect the environment. The theory has a geometric flavor, which makes the arguments easy to visualize and understand. Our theory is for complete decision makers, meaning that they have a complete set of preferences. Our main result shows that a complete rational decision maker implicitly has a probabilistic model of the environment. We also give a countable version of this result that sheds light on the issue of countable versus finite additivity by showing how it depends on the geometry of the space over which we have preferences. This is achieved by connecting rationality with the Hahn-Banach Theorem. The theory presented here can be viewed as a formalization and extension of the betting odds approach to probability of Ramsey and De Finetti [Ram31, deF37].
Similar resources
Non-Rational Discrete Choice Based On Q-Learning And The Prospect Theory
When modelling human discrete choice the standard approach is to adopt the rational model. This has been shown, however, to fail systematically under some conditions, which makes evident the need for a better approach. The choice model is however only part of the problem because it does not say how to deal with uncertainty, where learning is necessary. In this regard, some evidences support the...
Rationality, optimism and guarantees in general reinforcement learning
In this article, we present a top-down theoretical study of general reinforcement learning agents. We begin with rational agents with unlimited resources and then move to a setting where an agent can only maintain a limited number of hypotheses and optimizes plans over a horizon much shorter than what the agent designer actually wants. We axiomatize what is rational in such a setting in a manne...
Can I Do That? Discovering Domain Axioms Using Declarative Programming and Relational Reinforcement Learning
Robots deployed to assist humans in complex, dynamic domains need the ability to represent, reason with, and learn from, different descriptions of incomplete domain knowledge and uncertainty. This paper presents an architecture that integrates declarative programming and relational reinforcement learning to support cumulative and interactive discovery of previously unknown axioms governing doma...
On characterizations of the fully rational fuzzy choice functions
In the present paper, we introduce the fuzzy Nehring axiom, the fuzzy Sen axiom and a weaker form of the weak fuzzy congruence axiom. We establish interrelations between these axioms and their relation with the fuzzy Chernoff axiom. We express full rationality of a fuzzy choice function using these axioms along with the fuzzy Chernoff axiom.
A Modular On-line Profit Sharing Approach in Multiagent Domains
How to coordinate the behaviors of the agents through learning is a challenging problem within multi-agent domains. Because of its complexity, recent work has focused on how coordinated strategies can be learned. Here we are interested in using reinforcement learning techniques to learn the coordinated actions of a group of agents, without requiring explicit communication among them. However, t...